Skip to content

feat(worker): pass db_session to ProcessingState in processor#744

Open
thomasrockhu-codecov wants to merge 3 commits intotomhu/processing-state-db-pathfrom
tomhu/processing-passes-db-session
Open

feat(worker): pass db_session to ProcessingState in processor#744
thomasrockhu-codecov wants to merge 3 commits intotomhu/processing-state-db-pathfrom
tomhu/processing-passes-db-session

Conversation

@thomasrockhu-codecov
Copy link
Contributor

@thomasrockhu-codecov thomasrockhu-codecov commented Mar 10, 2026

Summary

  • Passes db_session to ProcessingState in process_upload(), activating the DB-backed state path added in PR 2a. The processor now writes state_id = PROCESSED to the database instead of Redis.
  • Removes the should_trigger_postprocessing check and direct finisher triggering from the processor. This orphaned-task recovery mechanism will be replaced by the gate key in PR 4a.
  • Cleans up unused imports (celery_app, upload_finisher_task_name).
  • Replaces mock-heavy orphaned-task tests with a simpler test that verifies the upload's DB state is updated to PROCESSED after successful processing.

Stacked on #742.

Test plan

  • CI passes
  • test_successful_processing_marks_upload_as_processed verifies DB state transition
  • Existing processor behavior (error handling, intermediate report saving) unchanged

Made with Cursor


Note

Medium Risk
Touches upload processing state transitions and dual DB/Redis bookkeeping; mistakes could miscount uploads or trigger post-processing incorrectly, but changes are localized and covered by an updated test.

Overview
The processor now constructs ProcessingState with db_session, so successful processing persists Upload.state_id = PROCESSED in the DB while still dual-writing Redis sets for tracking.

ProcessingState tightens Redis cleanup in DB-backed paths (e.g., always srem from the processing set on clear, and continues Redis moves on mark_upload_as_processed even when DB is updated).

Tests drop the mock-heavy orphaned-finisher scenarios and replace them with a single assertion that a successful process_upload updates the upload’s DB state_id to PROCESSED (without changing the string state).

Written by Cursor Bugbot for commit b0374a1. This will update automatically on new commits. Configure here.

@thomasrockhu-codecov thomasrockhu-codecov force-pushed the tomhu/processing-passes-db-session branch from d2d6c17 to 7e53294 Compare March 10, 2026 14:12
@thomasrockhu-codecov thomasrockhu-codecov force-pushed the tomhu/processing-state-db-path branch 2 times, most recently from 6ce1a54 to 9c656c6 Compare March 10, 2026 14:19
@thomasrockhu-codecov thomasrockhu-codecov force-pushed the tomhu/processing-passes-db-session branch from 7e53294 to 010a532 Compare March 10, 2026 14:19
@thomasrockhu-codecov thomasrockhu-codecov marked this pull request as ready for review March 10, 2026 14:23
@thomasrockhu-codecov thomasrockhu-codecov force-pushed the tomhu/processing-state-db-path branch from 9c656c6 to e0b3bdf Compare March 10, 2026 14:30
@thomasrockhu-codecov thomasrockhu-codecov force-pushed the tomhu/processing-passes-db-session branch from 010a532 to a5d77c0 Compare March 10, 2026 14:30
@thomasrockhu-codecov thomasrockhu-codecov force-pushed the tomhu/processing-passes-db-session branch 2 times, most recently from 700f40b to 39bd0c9 Compare March 10, 2026 15:03
@thomasrockhu-codecov thomasrockhu-codecov force-pushed the tomhu/processing-state-db-path branch from e0b3bdf to 696d952 Compare March 10, 2026 15:10
@thomasrockhu-codecov thomasrockhu-codecov force-pushed the tomhu/processing-passes-db-session branch from 39bd0c9 to f43f178 Compare March 10, 2026 15:10
Copy link

@cursor cursor bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.

Activate the DB-backed state path in process_upload() by passing
db_session to ProcessingState. The processor now writes PROCESSED
state to the database instead of Redis.

Also removes the should_trigger_postprocessing check and direct
finisher triggering from the processor -- this orphaned-task recovery
will be replaced by the gate key mechanism in a later PR.

Made-with: Cursor
@thomasrockhu-codecov thomasrockhu-codecov force-pushed the tomhu/processing-state-db-path branch from 696d952 to 3b30446 Compare March 10, 2026 15:20
@thomasrockhu-codecov thomasrockhu-codecov force-pushed the tomhu/processing-passes-db-session branch from f43f178 to 1361510 Compare March 10, 2026 15:21
The DB-backed path was skipping Redis writes, but the finisher still
reads from Redis. Keep writing to both until the finisher migrates
to DB-backed state in a later PR.

Made-with: Cursor
@thomasrockhu-codecov thomasrockhu-codecov force-pushed the tomhu/processing-passes-db-session branch from 1361510 to 81eb7d9 Compare March 10, 2026 15:33
Comment on lines 130 to 132
def mark_uploads_as_processing(self, upload_ids: list[int]):
if not upload_ids or self._db_session:
if not upload_ids:
return
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bug: Removing the self._db_session guard in mark_uploads_as_processing causes task retries to corrupt Redis state, preventing the finisher task from being triggered.
Severity: CRITICAL

Suggested Fix

Restore the self._db_session check in the guard condition within mark_uploads_as_processing. The line should be if not upload_ids or self._db_session: return. This ensures the function is a no-op for Redis when a database session is provided, preventing state corruption during retries.

Prompt for AI Agent
Review the code at the location below. A potential bug has been identified by an AI
agent.
Verify if this is a real issue. If it is, propose a fix; if not, explain why it's not
valid.

Location: apps/worker/services/processing/state.py#L130-L132

Potential issue: In a task retry scenario where a task fails after an upload is marked
as processed, the `mark_uploads_as_processing` function is called again. Due to the
removal of a check for `self._db_session`, this function re-adds the already processed
upload ID to the Redis "processing" set. This results in the upload ID existing in both
the "processing" and "processed" Redis sets. Consequently, a subsequent safety check,
`should_trigger_postprocessing`, incorrectly determines that processing is not complete,
which prevents the finisher task from running and leaves the commit in a stuck state.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pre-existing behavior identical to main branch. mark_uploads_as_processing always wrote to Redis on main (no db_session guard existed). This PR restores that same behavior for dual-write. Not a new regression.

@sentry
Copy link

sentry bot commented Mar 10, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 92.24%. Comparing base (3b30446) to head (b0374a1).
✅ All tests successful. No failed tests found.

Additional details and impacted files
@@                        Coverage Diff                         @@
##           tomhu/processing-state-db-path     #744      +/-   ##
==================================================================
- Coverage                           92.25%   92.24%   -0.01%     
==================================================================
  Files                                1304     1304              
  Lines                               47949    47950       +1     
  Branches                             1628     1628              
==================================================================
  Hits                                44233    44233              
- Misses                               3407     3408       +1     
  Partials                              309      309              
Flag Coverage Δ
workerintegration 58.56% <100.00%> (+0.03%) ⬆️
workerunit 90.33% <100.00%> (-0.01%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@codecov-notifications
Copy link

codecov-notifications bot commented Mar 10, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ All tests successful. No failed tests found.

📢 Thoughts on this report? Let us know!

When db_session is present, both the DB path and the Redis fall-through
path were incrementing CLEARED_UPLOADS. Move the Redis srem inside the
DB block and return early so the metric is only counted once.

Made-with: Cursor
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant